Abstract: 

Two new algorithms are described for matching two dimensional coordinate lists of point sources 
that are significantly faster than previous methods. By matching rarely occurring triangles (or more 
complex shapes) in the two lists, and by ordering searches by decreasing probability of success, it 
is demonstrated that very few candidates need be considered to find a successful match. Moreover, 
by immediately testing the suitability of a potential match using an efficient mechanism, the need to 
process the entire candidate set is avoided, yielding considerable performance improvements. Triangles 
are described by a cosine metric that reduces the density of triangle space, permitting efficient searches. 
An alternative shape characterization method that reduces computational overhead in the construction 
phase is discussed. The algorithms are tested on a set of 10 063 wide-field survey images, with fields-of-view 
up to 4.8° x 3.6°, successfully matching 100% of the images in a mean elapsed time of 6 ms (2.4 
GHz Athlon CPU). The elapsed time of the searching phase is shown to vary by less than 1 ms for list 
sizes between 10 and 200 points, demonstrating that fast, robust searches may be completed in nearly 
constant time, independent of list size. 

Keywords: astrometry — methods: data analysis — surveys 



1 Introduction 

In the course of carrying out a wide-field CCD imaging 
survey, two new methods for correlating the images to 
star catalogues have been developed, motivated by the 
need to efficiently handle the large number of stellar 
sources present on the images. Most previously published 
algorithms successfully cater for small lists (< 50 stars), 
but do not scale well to wide fields containing 
10^3 or more stellar sources. 

The problem of matching coordinate lists of point 
sources is a necessary prerequisite for deriving an as- 
trometric plate solution. The objective is to match a 
subset of stars found on an image to their correspond- 
ing entries in a stellar catalogue in order to determine 
the transformation between detector coordinates and 
sky coordinates. The algorithm must handle transla- 
tion and rotation, and small changes in scale caused 
by temperature related changes in focal length. In ad- 
dition, it must cope with additional and missing stars. 
That is, the two lists may only partially overlap. 

The efficiency of the algorithm is of paramount 
concern, since it is embodied within the closed-loop 
pointing system of the telescope and therefore affects 
the duty-cycle time, and ultimately constrains the num- 
ber of images that can be acquired each night. Surveys 
that require very high photometric precision typically 
seek to accurately align their fields on the same detector 
pixels each night to overcome residual flat-fielding 
errors (Everett & Howell 2001), and would benefit from 
the efficiency gains of a fast matching algorithm. Similarly, 
high cadence surveys, such as the Southern Sky 
survey (Keller et al. 2007), could improve precision and 
reduce their duty-cycle by utilizing a fast closed-loop 
pointing algorithm. Moreover, real-time attitude adjustments 
on spacecraft might be possible with the aid 
of an efficient matching algorithm to analyze on-board 
star camera images (see for example Fraser 2003). 

A number of algorithms have been proposed to 
solve this problem. Groth (1986) describes an algorithm 
that matches geometrically similar shapes (triangles) 
in the two lists. By limiting the number of triangles 
constructed, and by only matching those triangles 
whose ratio of longest to shortest side is within a 
defined limit, his matching phase has a computational 
complexity of O(n^4.5), where n is the number of stars 
in each list. Stetson (1990) describes a very similar 
algorithm that he developed independently at around 
the same time. 

Murtagh (1992) reviews a number of approaches 
and proposes his own, based upon characterization of 
a set of coordinate couples, with matching based on 
the proximity of feature vectors in the two lists. His 
method's matching phase has a computational complexity 
of O(n^2). 

Nevertheless, Groth's algorithm appears to be the 
most widely accepted, with the methods applied across 
disciplines. For example, Arzoumanian et al. (2005) 
discuss its application to the problem of computer-aided 
identification of whale sharks, while Marszalek & Rokita 
(2004), building upon the work of Groth (1986), describe 
an optimization to the voting phase of the algorithm, 
concluding that their method reduces the need 
for complicated filtering methods while successfully 
reducing the number of false matches. 

More recently, Pál & Bakos (2006) describe another 
variation of triangle matching, optimized to handle 
large lists of objects extracted from wide field images. 
Large fields contain thousands of stars and pose a severe 
test for matching algorithms, requiring efficient 
methods to accommodate the large number of point 
sources. 

The following sections discuss two new methods for 
pattern matching that have a matching phase with a 
complexity that is nearly O(1), at the cost of a slight 
loss in generality. They are collectively referred to as 
Optimistic Pattern Matching (OPM) because they assume 
that (i) a good match is likely to be found, and 
(ii) the scale of the image is approximately known, thus 
permitting the use of an early exit strategy whereby 
only a small percentage of the candidate list is examined. 
By contrast, previous methods assumed an unknown 
scale, which required the entire candidate list to 
be processed to determine the most likely match using 
a statistical approach. This required additional phases 
and complexity. In practice, a priori knowledge of 
an instrument's focal length is commonplace, and the 
use of a more general algorithm that assumes it is unknown 
mandates strategies that unnecessarily degrade 
performance. 

Section 2 describes the algorithms in detail. OPMa 
is based upon a new definition of triangle space, while 
OPMb uses an alternative shape characterization method. 
Section 3 tests their performance using a large sample 
of survey images and compares them to earlier meth- 
ods. Conclusions are summarized in Section 4. 

2 Algorithms 

The OPM algorithm has some similarity to previous 
algorithms in that it attempts to match triangles in the 
two lists. However, it differs fundamentally by search- 
ing for rarely occurring triangles that are unique (or 
nearly so) to the field. By ordering the triangles by 
their estimated selectivity, and by testing the rarest 
shapes first, a correct match is usually identified extremely 
quickly. Thus, only a small fraction of the 
candidate list must be searched, allowing the search 
process to terminate early. 

2.1 List Creation 

The image for which a transformation is to be derived 
is first processed by a stellar detection routine to con- 
struct a list of sources ordered by descending magni- 
tude. Each star is assigned an approximate instru- 
mental magnitude estimated from the (non-sky sub- 
tracted) signal contained within the pixels attributed 
to the star. By assuming a uniform sky background 
and ignoring the effects of partial pixels, the method 
is computationally efficient in deriving an estimate of 
the relative intensity of the stellar sources found on 
the image. The brightest n stars are selected from the 
list to form the image star list, denoted as I. The approximate 
equatorial coordinates of the field center are 
retrieved from the image header, together with the approximate 
focal length of the optics and the detector's 
physical dimensions, allowing the field of view (FOV) 
to be estimated. Using these quantities, the Hubble 
Guide Star Catalogue (Lasker et al. 1990) is read to 
extract a list of the n brightest catalogue stars within 
the field. This list of reference stars is denoted by R. 



The n brightest stars from each list are selected 
with the expectation that most will have a corresponding 
entry in the other list. However, experience shows 
that not all I will have a corresponding match in R. 
Some uncertainty in the field center and, more importantly, 
differences in the passbands of the detector 
and catalogue result in different stars being selected. 
Increasing the size of R increases the probability 
that more I will be matched, at the expense of 
a longer triangle construction phase. Unlike previous 
methods, increasing list sizes does not adversely affect 
OPM's matching performance in any significant way. 
It must be emphasized that only 3 stars common to 
both lists are necessary in order to find a successful 
match, but increasing n increases the chance of an unusually 
shaped triangle being formed, which facilitates 
an early exit from the matching phase. 

2.2 OPMa 

2.2.1 Triangle Construction 

Triangles are constructed from the stars in both lists. 
Each set of 3 stars (triplet) may be matched in 6 different 
ways with a triplet from the other list. Using an 
optimization introduced by Groth (1986), the number 
of candidates is reduced by a factor of 6 by assigning 
the vertices of the triangle such that vertices A and 
B define the shortest side, B and C the longest side, 
and A and C define the intermediate length side (see 
Figure 1). This scheme generates 

$$T = n(n - 1)(n - 2)/6 \qquad (1)$$

unique triangles (T) from a list of n points. Next, 
we wish to assign some metrics to each triangle to describe 
its properties. Groth (1986) used the ratio of the 
longest to the shortest side and the cosine of the angle 
at vertex B to define its position in a two-dimensional 
triangle space. Valdes et al. (1995) used the ratios of 
two sides, (b/a, c/a), where a, b and c are the side lengths 
in decreasing order, to define its location in triangle 
space. Pál & Bakos (2006) defined a more elaborate 
scheme based on the side lengths and some auxiliary 
quantities. Although more computationally complex, 
their definition preserves chirality and maps triangles 
using a continuous function, cleverly avoiding discontinuities 
where small measurement errors may result 
in triangles being mapped to different parts of triangle 
space. 

The OPMa algorithm defines triangle space as 
(x_t, y_t), where 

$$x_t = \vec{CB} \cdot \vec{CA}, \qquad y_t = \frac{a}{c} \qquad (2)$$

with CB and CA being the vectors from vertex C to 
B, and C to A respectively, and a/c is the ratio of the 
length of the longest to the shortest side (Figure 1). 
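
As an illustration of Equation (2), the following Python sketch relabels the vertices of a triplet according to the convention above and then evaluates (x_t, y_t). The function names `triangle_space` and `triangle_count`, and the use of plain (x, y) tuples, are assumptions made for this example; it is not the author's implementation.

```python
import math


def triangle_count(n):
    """Number of unique triangles from n points, Equation (1)."""
    return n * (n - 1) * (n - 2) // 6


def triangle_space(p, q, r):
    """Return (x_t, y_t) for three (x, y) points.

    Vertices are relabelled so that AB is the shortest side, BC the
    longest and AC the intermediate one. Then x_t = CB . CA and
    y_t = longest / shortest.
    """
    pts = [p, q, r]
    # Length of the side opposite each vertex (the side joining the other two).
    opposite = [math.dist(pts[(i + 1) % 3], pts[(i + 2) % 3]) for i in range(3)]
    # C lies opposite the shortest side, B opposite the intermediate side,
    # and A opposite the longest side.
    order = sorted(range(3), key=lambda i: opposite[i])
    C, B, A = (pts[i] for i in order)
    cb = (B[0] - C[0], B[1] - C[1])
    ca = (A[0] - C[0], A[1] - C[1])
    x_t = cb[0] * ca[0] + cb[1] * ca[1]
    y_t = opposite[order[2]] / opposite[order[0]]
    return x_t, y_t
```

Because the dot product depends only on the relative positions of the vertices, the same triangle translated or rotated maps to the same point, while a scaled copy changes x_t but not y_t.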

Figure 1: OPMa nomenclature. 

A dot product, or cosine metric, is commonly used 
in text-based matching applications to compare the 
similarity of strings (see for example Rawat et al. 2004). 
It has a number of useful properties, being stable under 
translation and rotation, and is computationally 
efficient to calculate using the relation: 

$$\mathbf{X} \cdot \mathbf{Y} = |\mathbf{X}||\mathbf{Y}| \cos\theta = \sum_i x_i y_i. \qquad (3)$$

However, its primary advantage over the other representations 
is that it provides a scalar value that is a 
function of the lengths of the two vectors and the angle 
between them. Therefore, it is useful in discriminating 
between the set of triangles that share the same side 
length ratios, but with different perimeters. Such triangles 
map to the same location in triangle space when 
using the definition of Groth (1986) or Valdes et al. 
(1995), requiring additional algorithmic complexity to 
separate the false matches that they produce. 

OPM triangle space is sparse compared to that of 
Valdes et al. (1995), who compressed all triangles into 
the range (0 < x_t < 1, 0 < y_t < 1), and Pál & Bakos 
(2006), who used a domain of (-1 < x_t < 1, -1 < y_t < 1). 
Groth (1986) used a cosine of one of the angles, restricting 
0 < x_t < 1, and arbitrarily constrained y_t < 
10. OPM's definition permits an unconstrained range 
of values, thereby lowering the density (points per unit 
area) of triangle space, thus reducing the probability 
of misidentification. 

Figure 2 plots OPMa triangle space for a representative 
image. Triangles formed from I and R are 
plotted using red pluses and green crosses respectively. 
A value of n = 25 was used, resulting in 2300 triangles 
in each list. Two interesting features are immediately 
apparent. Firstly, the vast majority of triangles occur 
near the origin of the plot, where the density of 
points is greatest. Searches conducted in this region 
are very expensive due to the large number of candidates 
that must be considered. Secondly, a number of 
curving rows emanating from the origin and reaching 
up to large values of x_t and/or y_t are visible. 

Each curve represents the set of triangles formed 
by a close pair of stars with a third more distant one. 
As the distance to the third star increases, the lengths 
of the two longest sides increase and the angle at vertex 
C becomes more acute, resulting in a larger dot product. 
Similarly, the ratio of the longest to the shortest side 
increases. A key feature is that these curving rows are 
rather distinct, with the points furthest from the origin 
having very few neighbors. Processing the outlying 
points is very cost-effective due to the low number of 
candidates that must be considered. 

The plot also shows the I/R pairings that were 
verified to be correct (blue pluses). Obviously, a few 
rows of image stars have no analogue extracted from 
the catalogue. This was caused by differences in relative 
magnitude of the stars in the two lists, due primarily 
to passband disparities. Similarly, some R have no 
matching I for the same reason. As expected, increasing 
the size of the R list (to 55 in this case) results in 
matches for all I. 

Figure 2: OPMa triangle space. 

Triangle construction has a computational complexity 
of O(n^3). However, by implementing an optimization 
proposed by Valdes et al. (1995) that avoids 
calculating the same side length multiple times, the 
number of length calculations has been reduced from 
~T^3 to ~T^2, with a proportional decrease in elapsed 
time. 
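
One way to avoid recomputing side lengths is to calculate each pairwise distance once and look it up when the triangles are enumerated. The sketch below illustrates that idea only; the helper names `pair_lengths` and `triangle_sides` are hypothetical, and it is not claimed to reproduce the scheme of Valdes et al. (1995).

```python
import math
from itertools import combinations


def pair_lengths(points):
    """Compute each pairwise distance once, keyed by (i, j) with i < j."""
    return {(i, j): math.dist(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}


def triangle_sides(points):
    """Yield (i, j, k, sides) for every triplet, reusing the cached lengths."""
    d = pair_lengths(points)
    for i, j, k in combinations(range(len(points)), 3):
        yield i, j, k, (d[i, j], d[i, k], d[j, k])
```

With n points this performs n(n - 1)/2 distance calculations in total, rather than three per triangle.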

2.2.2 Matching Triangles 

Searching for matching triangles in triangle space is 
a combinatorial problem. In principle, all triangles 
generated from the I and R lists must be compared. A 
match is deemed to occur when a point in I triangle 
space is found to be within a certain tolerance ε of a 
point in R triangle space. 

A brute force method that compares each triplet of 
I stars to the entire list of R triplets is an expensive 
operation of O(n^6). However, by sorting the R triangles 
by y_t and using a binary search to find the starting 
point within the list, a large number of comparisons 
may be avoided. Only the points falling within y_t ± ε 
need be compared. The choice of limiting searches using 
y_t instead of x_t is important, since it minimizes the 
number of candidates that fall within y_t ± ε, particularly 
when y_t is large. The values in each coordinate 
are compared and a match is declared when they are 
within 2%, the tolerance having been determined empirically 
from test data. 
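
The windowed search described above can be sketched with Python's `bisect` module. The function name `find_candidates`, the (x_t, y_t, label) tuple layout, and the reading of the 2% tolerance as a fractional tolerance on each coordinate are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


def find_candidates(ref_triangles, ref_y, x_t, y_t, tol=0.02):
    """Return reference triangles close to (x_t, y_t) in triangle space.

    ref_triangles: list of (x_t, y_t, label) tuples sorted by y_t.
    ref_y: the y_t values of ref_triangles, in the same order.
    tol: fractional tolerance (assumed interpretation of the 2% limit).
    """
    # Binary search for the window of entries whose y_t is within tolerance.
    lo = bisect_left(ref_y, y_t * (1.0 - tol))
    hi = bisect_right(ref_y, y_t * (1.0 + tol))
    # Only entries inside the window are compared in x_t.
    return [t for t in ref_triangles[lo:hi]
            if abs(t[0] - x_t) <= tol * abs(x_t)]
```

For large y_t the window typically contains very few entries, so the linear comparison in x_t is cheap.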

2.2.3 Early-Exit Strategy 

The OPMa definition of triangle space ensures that 
triangles formed by two close vertices and a third more 
distant vertex map to sparse regions in triangle space, 
far from the densest areas occupied by triangles with 
similar side lengths. This property is exploited by 
searching the lowest density regions first, in the hope 
that a match will be found very quickly, allowing the 
process to terminate before the higher density areas 
must be considered. Each I triangle is assigned a 
score defined as the product of x_t and y_t, and the list 
is sorted into descending order of score. Processing 
triangles in this order ensures that the relatively rare 
(highly selective) triangles are matched first. Comparisons 
are inexpensive, because there are few similar 
R candidates, and the few candidates that are within 
range are likely to be true matches. 

Publications of the Astronomical Society of Australia 

2.2.4 Checking a Match 

All potential matches require verification, since false 
matches will always be present. Voting (Groth 1986; 
Valdes et al. 1995; Pál & Bakos 2006) makes use of an 
array to tally the number of times each pair of stars is 
involved in a potential match. This is a time consuming 
operation, since it requires all candidate triangles 
to be processed to allow each one an opportunity to 
vote. The likely matches are then selected on the basis 
of probability, from the pairs that received the 
highest number of votes. 

By contrast, OPM assumes that highly selective 
triangles are likely to yield a true match, and if con- 
firmed, the search can be immediately terminated. There- 
fore, when a potential match is found, the algorithm 
immediately attempts to verify the relationship us- 
ing a light-weight (inexpensive) process. If unsuccess- 
ful, OPM continues processing more candidates, hav- 
ing expended little effort in screening out the false 
match. When the preliminary verification is positive, a 
more robust and relatively expensive verification pro- 
cess is used to comprehensively test the suitability of 
the match. It is assumed that this process will be ex- 
ecuted very few times, most likely only once. 
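
The control flow of the early-exit search, with its cheap preliminary check followed by a more expensive final check, might be sketched as follows. The function `opm_search` and its callback parameters are hypothetical names introduced for this example; the scoring by the product of the triangle-space coordinates follows Section 2.2.3.

```python
def opm_search(image_triangles, find_candidates, preliminary_ok, final_ok):
    """Search image triangles in order of decreasing selectivity.

    image_triangles: list of (x_t, y_t, label) tuples.
    find_candidates: callable returning reference candidates for a triangle.
    preliminary_ok, final_ok: callables (triangle, candidate) -> bool.
    Returns (triangle, candidate, examined) on success, otherwise None.
    """
    # Rarest (highest score) triangles are tried first.
    ranked = sorted(image_triangles, key=lambda t: t[0] * t[1], reverse=True)
    examined = 0
    for tri in ranked:
        for cand in find_candidates(tri):
            examined += 1
            # The expensive check runs only if the cheap one passes.
            if preliminary_ok(tri, cand) and final_ok(tri, cand):
                return tri, cand, examined  # early exit
    return None
```

The returned count of examined candidates gives a simple way to confirm how early the search terminated.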

2.2.5 Preliminary Verification 

The preliminary verification (PV) process determines 
the transformation from image to sky coordinates using 
an astrometric plate solution. It commences with 
the calculation of standard coordinates (ξ, η), representing 
the gnomonic projection of the spherical sky 
onto the plane of the detector, using the relations 

$$\xi = \frac{\cos\delta \, \sin(\alpha - A)}{\sin D \sin\delta + \cos D \cos\delta \cos(\alpha - A)} \qquad (4)$$

$$\eta = \frac{\sin D \cos\delta \cos(\alpha - A) - \cos D \sin\delta}{\sin D \sin\delta + \cos D \cos\delta \cos(\alpha - A)}, \qquad (5)$$

where (α, δ) represent the equatorial coordinates of 
the catalogue stars and (A, D) is the origin of the coordinates, 
which is usually taken as the approximate 
plate center. The standard coordinates are related to 
the measured coordinates (x, y) of the centroids of the 
stars on the image using the following relations: 

$$\xi - \frac{x}{L} = ax + by + c \qquad (6)$$

$$\eta - \frac{y}{L} = a'x + b'y + c', \qquad (7)$$


where a, b, c, a', b', c' are the plate constants that describe 
the translation and rotation necessary to transform 
between the two coordinate systems, and L is the 
focal length of the optics, expressed in the same units 
as x and y (Marsden 1982). 
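
The standard coordinates of Equations (4) and (5) can be evaluated directly. The sketch below follows the sign convention of Equation (5) as written in the text; the function name `standard_coords` and the choice of radians for all angles are assumptions.

```python
import math


def standard_coords(alpha, delta, A, D):
    """Standard coordinates (xi, eta) of a star at (alpha, delta).

    (A, D) is the origin of the projection. All angles are in radians.
    """
    denom = (math.sin(D) * math.sin(delta)
             + math.cos(D) * math.cos(delta) * math.cos(alpha - A))
    xi = math.cos(delta) * math.sin(alpha - A) / denom
    eta = (math.sin(D) * math.cos(delta) * math.cos(alpha - A)
           - math.cos(D) * math.sin(delta)) / denom
    return xi, eta
```

A star located exactly at the origin (A, D) maps to (0, 0), which provides a quick sanity check.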

The candidate triangle relates three points on the 
image to three in the reference catalogue, and allows 
us to write six equations to solve the six unknown plate 
constants. As a check, we note that a ≈ b' and b ≈ -a' 
(Edberg 1983), assuming that the axes are perpendicular 
and have the same scale, which should be the case 
if correct pairings have been selected. If these pairs of 
plate constants differ by more than 2.5%, a value determined 
empirically from test images, the candidate pairing is 
rejected. 
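
A minimal sketch of this check, assuming Equations (6) and (7) are solved exactly from the three pairings by Cramer's rule. The names `solve3`, `plate_constants` and `plausible` are hypothetical, and measuring the 2.5% difference relative to the largest of the four linear constants is an assumption, since the text does not state the normalization.

```python
def solve3(rows, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    def det(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    d = det(rows)
    if abs(d) < 1e-12:
        raise ValueError("stars are collinear or coincident")
    solution = []
    for col in range(3):
        m = [list(r) for r in rows]
        for i in range(3):
            m[i][col] = rhs[i]
        solution.append(det(m) / d)
    return solution


def plate_constants(xy, xieta, L):
    """Return (a, b, c, a', b', c') from three image/standard coordinate pairs."""
    rows = [(x, y, 1.0) for x, y in xy]
    a, b, c = solve3(rows, [xi - x / L for (x, _), (xi, _) in zip(xy, xieta)])
    a2, b2, c2 = solve3(rows, [eta - y / L for (_, y), (_, eta) in zip(xy, xieta)])
    return a, b, c, a2, b2, c2


def plausible(consts, tol=0.025):
    """Check a ~ b' and b ~ -a' to within a fractional tolerance (assumed form)."""
    a, b, _, a2, b2, _ = consts
    scale = max(abs(a), abs(b), abs(a2), abs(b2))
    return abs(a - b2) <= tol * scale and abs(b + a2) <= tol * scale
```

An incorrect pairing generally produces a sheared or mirrored transformation, for which the two conditions fail.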

2.2.6 Final Verification 

If the solution appears to be reasonable, a more robust 
final verification (FV) check is performed. Using 
the initial plate solution, all I are transformed to 
equatorial coordinates and compared to the entire list 
of R to find their closest match. An important optimization 
speeds up this step by avoiding the need 
to compare all entries. An auxiliary array, containing 
the indexes into the R array, was prepared when 
the R list was built initially. The auxiliary array was 
sorted by declination, allowing the R array to remain 
sorted by magnitude. Using the auxiliary array, a binary 
search is performed to locate the starting point 
within R where comparisons should commence. The 
equatorial coordinates of each transformed I are compared 
to the catalogue coordinates of all R that are 
within ε arcsec. A tolerance of 3σ is used, where σ is 
the typical astrometric residual of a full plate solution 
at this image scale, thus allowing for uncertainties in 
the initial transformation, which is based upon only 3 
stars, two of which are closely separated. 

A small angular separation approximation (Meeus 
1991) is used to estimate the separation of each pair 
of stars: 

$$s^2 = (\Delta\alpha \cos\delta)^2 + (\Delta\delta)^2, \qquad (8)$$

where s is the separation in degrees, Δα is their separation 
in R.A., Δδ is their separation in declination, 
and δ is the declination of the target I (with cos δ calculated 
once outside the main loop). The approximation 
avoids using transcendental functions, which are 
computationally expensive relative to ordinary floating 
point operations (addition, multiplication, division). 
Errors resulting from the approximation are absorbed 
by the relatively large value of ε. The squared separation, 
s^2, is compared to ε^2 to avoid a costly sqrt 
operation. 
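
Equation (8) and the squared comparison can be sketched as below. The function name `within_tolerance` is hypothetical, all quantities are taken to be in degrees, and wrap-around of right ascension at 0°/360° is ignored for brevity.

```python
import math


def within_tolerance(ra1, dec1, ra2, dec2, eps, cos_dec=None):
    """Return True if two positions are separated by no more than eps.

    cos_dec may be supplied by the caller so that the cosine of the
    target declination is evaluated once, outside the comparison loop.
    """
    if cos_dec is None:
        cos_dec = math.cos(math.radians(dec1))
    d_ra = (ra1 - ra2) * cos_dec
    d_dec = dec1 - dec2
    # Compare squared quantities to avoid taking a square root.
    return d_ra * d_ra + d_dec * d_dec <= eps * eps
```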

Since the I array is sorted by relative magnitude, 
the brightest stars are compared first. If multiple R 
are found within the matching tolerance, the brightest, 
unassigned R is used as the match. This is determined 
by simply saving the lowest R index when a 
match occurs. Since the R array is sorted by descending 
magnitude, the saved index represents the brightest 
R star. Once an assignment is made, the particular 
R is flagged to avoid matching it again. This scheme 
ensures that the brightest I are matched to the brightest 
R when there are multiple candidates within the 
matching tolerance, mimicking the decision that a human 
operator would have made. 
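
The assignment scheme, including the declination-sorted auxiliary index, could look something like the following sketch. The function name `assign` and the (ra, dec) tuple layout in degrees are assumptions; both input lists are taken to be ordered brightest first.

```python
import math
from bisect import bisect_left, bisect_right


def assign(image_stars, ref_stars, eps):
    """Match transformed image stars to reference stars, brightest first.

    image_stars, ref_stars: lists of (ra, dec) in degrees, brightest first.
    Returns a dict mapping image index -> reference index.
    """
    # Auxiliary index into ref_stars, sorted by declination.
    aux = sorted(range(len(ref_stars)), key=lambda j: ref_stars[j][1])
    aux_dec = [ref_stars[j][1] for j in aux]
    used = [False] * len(ref_stars)
    pairs = {}
    for i, (ra, dec) in enumerate(image_stars):
        cos_dec = math.cos(math.radians(dec))  # once per image star
        lo = bisect_left(aux_dec, dec - eps)
        hi = bisect_right(aux_dec, dec + eps)
        best = None
        for j in aux[lo:hi]:
            if used[j]:
                continue
            d_ra = (ra - ref_stars[j][0]) * cos_dec
            d_dec = dec - ref_stars[j][1]
            # Keep the lowest index, i.e. the brightest unassigned star.
            if d_ra * d_ra + d_dec * d_dec <= eps * eps and (best is None or j < best):
                best = j
        if best is not None:
            used[best] = True
            pairs[i] = best
    return pairs
```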

After all assignments have been completed, a new 
astrometric solution is calculated using all assigned 
pairs. The process iterates 3 times (this number is 
user controllable), successively refining the solution at 
each iteration as more stars are matched. At the end 
of the process, the final number of matched stars is 
compared to a predefined limit. If sufficient stars have 
been identified, the match is deemed to be correct and 
the search process terminates. In the unlikely event 
that insufficient stars have been identified, the search 
process continues with the next candidate. 

2.3 OPMb 

An alternative algorithm, named OPMb, was developed 
several years ago. I have since learned that it 
bears some similarity to that described by Murtagh 
(1992). Nevertheless, my approach has some major 
differences, principally in its use of an early exit strategy 
and just-in-time approach that avoids calculating 
quantities until they are required. By postponing various 
calculations, computational effort is saved in the 
hope that an early exit will render them unnecessary. 

OPMa is dominated by triangle construction costs, 
particularly for large n. OPMb addresses this problem 
by reducing the number of shapes to be characterized. 
It also uses a more restrictive shape definition, 
which reduces the number of false positives that may 
occur and results in a successful match being found 
in nearly constant time, independent of n. Instead of 
matching triangles, an arbitrarily complex geometric 
shape, made up of a user-defined number of points, is 
used (Figure 3). The shape to be matched is characterized 
by the relationship of the central star (A) with 
respect to the other stars (B, C, D, ...), using their 
separations and position angles (PA) relative to star 
A. Angles are measured relative to north (defined as 
the -y direction as seen from star A), although this is 
arbitrary. 



Figure 3: OPMb constructs shapes of arbitrary 
complexity using a user-defined number of points. 

This definition is similar to that used by Murtagh 
(1992), although his world view describes the relationship 
of every star to its n-1 neighbors, requiring O(n^2) 
calculations to describe all points. Furthermore, his 
world view is calculated for both the I and R lists, 
with the matching process comparing all members of 
both sets to find a high confidence match, with a resulting 
computational complexity of O(n^2). Another 
point of difference is that Murtagh bins the position 
angles into 1° increments in order to accommodate rotation 
of the coordinate systems, with his matching 
phase requiring the comparison of the world view of 
set A to 360 versions of set B. Although OPMb uses a 
superficially similar shape characterization, the algorithms 
are quite different. 

In principle, increasing the number of stars used to 
define the shape adds greater constraints and therefore 
reduces the number of false matches that may occur. 
It also allows more points to be used in the initial as- 
trometric solution, leading to a more accurate trans- 
formation. In practice, using 3 stars is sufficient be- 
cause the matching phase is very efficient relative to 
the shape characterization phase (analogous to triangle 
construction). The latter dominates the elapsed time 
of the search, even when false positives are present. 

OPMb processing commences with the lists of the 
n brightest I and R, as described in Section 2.1. A 
sorted list of separations and PAs for each pair of stars 
in the R list is constructed. The number of unique 
pairs, P, is given by 

$$P = n(n - 1)/2. \qquad (9)$$

This immediately provides an improvement over OPMa, 
where two lists of triplets must be prepared instead of 
one list of pairs (P < T). 
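
A possible construction of the sorted pair list is sketched below. The function name `build_pairs` and the tuple layout are assumptions; the position angle is measured from the -y direction as in the text, while the sense in which it increases (towards +x here) is an arbitrary choice made for the example.

```python
import math
from itertools import combinations


def build_pairs(stars):
    """Return [(separation, pa_deg, i, j)] for all unique pairs, sorted by separation.

    stars: list of (x, y) positions. The PA is that of star j as seen
    from star i; the reverse direction differs by 180 degrees.
    """
    pairs = []
    for i, j in combinations(range(len(stars)), 2):
        dx = stars[j][0] - stars[i][0]
        dy = stars[j][1] - stars[i][1]
        sep = math.hypot(dx, dy)
        # Angle measured from -y (north), increasing towards +x.
        pa = math.degrees(math.atan2(dx, -dy)) % 360.0
        pairs.append((sep, pa, i, j))
    pairs.sort()
    return pairs
```

For n = 25 this yields 300 pairs, compared with 2300 triangles for the same list.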

The search process commences with the selection 
of the m brightest I, where m is the number of stars 
used to define the shape to be matched. Each star 
in the candidate list is assigned a letter, A being the 
first, B the second, and so on. The separation and 
PA of each pair, AB, AC, AD, ... are calculated. The 
separations are computed from the focal length of the 
optical system and physical dimensions of the detector. 
The PA is calculated relative to the top of the 
detector, since the absolute rotation relative to the celestial 
sphere is unknown at this stage. Postponing 
the same calculations for I to the search phase avoids 
the effort of pre-calculating the entire list when only 
a few values may be required, as is the case when an 
early exit occurs. This just-in-time approach results 
in a considerable saving in computational effort. 

The search process attempts to find a match for 
AB in the list of R pairs. A binary search is used 
to quickly identify those pairs with separations within 
the matching tolerance ε. The difference in PA between 
the image and reference pairs is assumed to be 
due to rotation of the detector. The R list is now 
searched to find candidates for AC, AD, etc. using a 
binary search on separation and a knowledge of the 
rotational offset defined by AB with respect to the 
catalogue value. Candidate pairs with a rotational offset 
>1° are rejected. A possible optimization, though 
not implemented, could reject candidate pairings not 
matching the absolute orientation of the detector with 
respect to the sky (when known), thus avoiding the 
need to determine the rotational offset and permitting 
incorrect pairs to be rejected immediately. 
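
The shape search can be sketched as follows, under several assumptions: the reference list holds (separation, PA, i, j) tuples sorted by separation, each unique pair is tried in both directions because the central star may be either member, and the helper names `angle_diff`, `oriented` and `match_shape` are hypothetical. Verification of the returned candidates is omitted.

```python
from bisect import bisect_left, bisect_right


def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def oriented(pair):
    """Yield a reference pair in both directions (central star first)."""
    sep, pa, i, j = pair
    yield sep, pa, i, j
    yield sep, (pa + 180.0) % 360.0, j, i


def match_shape(image_pairs, ref_pairs, eps, max_rot_err=1.0):
    """Find reference pairs matching AB, AC, ... about a common central star.

    image_pairs: [(separation, pa_deg)] for AB, AC, AD, ...
    ref_pairs: [(separation, pa_deg, i, j)] sorted by separation.
    Returns a list of oriented reference pairs, or None.
    """
    ref_seps = [p[0] for p in ref_pairs]

    def near(sep):
        # Binary search for reference pairs within eps of the separation.
        lo = bisect_left(ref_seps, sep - eps)
        hi = bisect_right(ref_seps, sep + eps)
        return (o for p in ref_pairs[lo:hi] for o in oriented(p))

    sep_ab, pa_ab = image_pairs[0]
    for first in near(sep_ab):
        rot = angle_diff(first[1], pa_ab)  # rotation implied by AB
        centre = first[2]
        found = [first]
        for sep, pa in image_pairs[1:]:
            hit = next((c for c in near(sep)
                        if c[2] == centre
                        and abs(angle_diff(c[1], pa + rot)) <= max_rot_err), None)
            if hit is None:
                break
            found.append(hit)
        else:
            return found  # every pair matched about the same central star
    return None
```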

Once m candidates have been identified, the preliminary 
verification function is called (see Section 2.2.5). 
The m pairings are used to determine an initial astrometric 
solution which, if acceptable, may result in the 
final verification process being called. If the solution 
is rejected, or insufficient candidates are identified, the 
next candidate within the matching tolerance is selected 
and the process continues. If the current candidate 
list cannot be matched, the search process begins 
again by selecting another set of candidate stars. The 
process repeats until a successful match is found, or the 
entire list of candidates is exhausted, in which case we 
declare that a match could not be found. 



3 Performance 

It is not possible to analytically determine the order 
of complexity of these algorithms because they do not 
perform a fixed number of searches. In the best case, 
a successful match may be found after processing just 
one candidate. In the worst case, the entire list of 
candidates may have to be searched. 

In order to investigate whether there is any cause 
for optimism, that is, whether an early match will oc- 
cur in practice with real data, 10 063 unfiltered, wide- 
field survey images acquired with a variety of SBIG 
detectors and focal-lengths were analyzed. Table 1 
lists their characteristics, with the columns describing 
the focal-length (f), number of images, CCD detector, 
and effective FOV of each set of images. Although this 
sample of test images was acquired with SBIG detec- 
tors, the algorithms are generic in nature and apply 
equally to all CCD detectors. 

Fields were selected from an all-sky survey con- 
ducted from a latitude of 35°S. The deepest, widest 
fields, located near the galactic equator, contained 
~3 x 10^4 stellar sources to m_V ~ 15. Images containing 
moderate defects such as blooming spikes, satellite 
trails, and thin cirrus were retained in the sample. 
Images that were heavily obscured by cloud were dis- 
carded. In order to test algorithmic robustness under 
a variety of conditions, approximately 40% of the im- 
ages were taken from a photometric survey of bright 
stars that were strongly defocused to avoid saturation. 



that this effect has been averaged out over the timescale 
of the test and that each test was affected equally. 

Separate timers were used to measure the perfor- 
mance of each of the following phases: triangle (pair) 
construction, sorting, searching for candidates, prelim- 
inary verification, and final verification. In the follow- 
ing discussion, the term matching refers to the com- 
bined efforts of searching, preliminary verification and 
final verification. Although other algorithms do not 
consider calculation of the transformation (as performed 
by final verification) to be part of the matching pro- 
cess, it is necessary to include this for OPM, since 
we must be certain that an early exit is warranted. To 
avoid unfairly penalizing search performance, final ver- 
ification was configured to use a maximum 100 image 
stars. 

Results for the two algorithms are summarized in 
Tables [2] & [3] with the columns describing list size 
(n), total elapsed time, elapsed time for the triangle 
(pair) construction phase, elapsed time for the match- 
ing phase, and the percentage of images successfully 
matched. Figures & HU plot the relative construction 
and matching costs. Figures [5] & [7] show a break-down 
of the matching phase for each algorithm. Note that 
the plots use the same vertical scale for easy compari- 
son, and that the abscissa for the OPMb plots extend 
to n = 200. 



3.1 OPM a Performance 



Table 2: OPMa performance

  n    Total Elapsed      Construct         Match       Match
           (ms)             (ms)            (ms)         (%)
  10     3.99 ± 0.71     0.23 ± 0.01    3.34 ± 0.67     90.55
  20     6.16 ± 1.02     2.13 ± 0.14    3.53 ± 1.00     99.92
  30    12.14 ± 1.75     7.83 ± 0.19    3.60 ± 1.74    100.00
  40    24.43 ± 1.36    19.74 ± 0.51    3.61 ± 1.26    100.00
  50    46.11 ± 1.60    40.72 ± 0.79    3.75 ± 1.38    100.00
  60    79.63 ± 2.16    73.20 ± 0.92    3.96 ± 1.97    100.00
  70   128.02 ± 3.49   120.14 ± 1.38    4.30 ± 3.24    100.00
  80   194.07 ± 4.72   184.33 ± 2.85    4.66 ± 3.77    100.00
  90   280.84 ± 6.25   268.64 ± 3.73    5.25 ± 5.04    100.00
 100   391.28 ± 7.84   376.01 ± 4.00    6.00 ± 6.80    100.00





Table 1: Test Images

 f (mm)    Number   Detector   FOV (deg)
 102        2498    ST-6       4.8 x 3.6
 135         211    ST-6       3.6 x 2.8
 180 (a)    3943    ST-8XE     2.9 x 1.9
 180        1234    ST-8XE     4.4 x 2.9
 188        1102    ST-8XE     4.2 x 2.8
 200        1075    ST-8XE     4.0 x 2.6

 (a) sub-frame



Elapsed times were measured with the Pentium
performance counter (RDTSC instruction), which
reports the number of clock cycles that have occurred
since the CPU was powered up. Despite its high
resolution, precision is limited by unavoidable context
switches within the operating system. It is assumed




Figure 4: OPMa triangle construction cost relative
to the matching phase. [x-axis: List size (n)]

Figure 5: OPMa matching phase only. [Legend: Final
Verification, Preliminary Verification, Search; x-axis:
List size (n)]



The following performance characteristics are
observed: All values of n ≥ 30 resulted in a 100% match
rate. Large lists were unnecessary and were in fact
detrimental, increasing triangle generation times. Even
a small value of n = 30 was sufficient to generate a
number of highly selective triangles, allowing a match
to be found quickly. Values less than 30 did not
succeed in matching all images, although somewhat
surprisingly, even at n = 10, over 90% of the images were
matched successfully.

Figure [4] plots the cost of the matching phase
relative to triangle construction. At small n, triangle
construction costs are negligible and matching
dominates (with the majority apportioned to the final
verification phase). As n increases, the triangle
construction time quickly starts to dominate matching
costs, the latter being nearly constant. That triangle
construction dominated the total time is in complete
contrast to the performance statistics published by
Groth (1986) and Marszalek & Rokita (2004), where
triangle construction was the fast operation and
matching dominated. Realizing that triangle construction
costs should be similar for all equally optimized
algorithms further highlights the effectiveness of the
early exit strategy.
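
The rapid growth of the construction phase can be checked against the number of triangles that n points generate. The sketch below assumes a full triangulation, i.e. C(n, 3) triangles, which is an assumption made for illustration rather than a statement about the OPMa implementation; the timings are copied from Table 2.

```python
from math import comb

# Construct times (ms) copied from Table 2.
construct_ms = {10: 0.23, 20: 2.13, 30: 7.83, 50: 40.72, 100: 376.01}

# Cost per triangle in microseconds, ASSUMING a full triangulation.
us_per_triangle = {
    n: 1000.0 * t / comb(n, 3) for n, t in construct_ms.items()
}
for n, cost in us_per_triangle.items():
    print(f"n={n:3d}  triangles={comb(n, 3):6d}  us/triangle={cost:.2f}")

# Growth from n = 10 to n = 100: measured time vs triangle count.
time_ratio = construct_ms[100] / construct_ms[10]
count_ratio = comb(100, 3) / comb(10, 3)
print(f"time ratio {time_ratio:.0f}, triangle-count ratio {count_ratio:.1f}")
```

Under that assumption the cost per triangle stays close to 2 microseconds across the whole range, so the measured construction times are consistent with cubic growth in n.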

The elapsed time in the search and preliminary
verification phases is small relative to final verification,
confirming that they are suitably light (Figure [5]). The
cost of final verification could be further reduced by
limiting the number of iterations that are performed
(3 by default). One could conceivably stop iterating
once a sufficient number of stars has been identified,
although this optimization was not implemented.

There is very little scatter in total elapsed time,
confirming that fast matches, leading to early exits,
occur consistently. The median number of candidates
processed from the I triangle list is very small in
absolute terms. Less than 1.5% of I were examined for
n = 20, reducing to 0.05% of I for n = 100.

A value of n = 30 appears to be optimal; large
enough to produce reliable results and small enough to
limit triangle construction and matching costs. While
it is impressive that the entire process can be
completed successfully in ~12 ms, it is equally
remarkable that the cost of searching a much larger list
(n = 100) is not prohibitive. This is possible because
only a small subset of the triangles is searched instead
of processing all combinations. Nevertheless, large lists
offer no practical advantage, particularly when smaller
lists are completely reliable.



3.2 OPMb Performance

OPMb tests were conducted with m = 3 in order
to directly compare the performance to OPMa. It
was found that absolute search performance was faster
than OPMa. For small values of n, both algorithms
provide similar performance, due to the relatively large
cost of final verification. At n = 30, OPMb is twice
as fast as OPMa, due primarily to savings in the
construction phase. By n = 100, OPMb is an order of
magnitude faster than OPMa.
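
These ratios can be reproduced directly from the total elapsed times in Tables 2 and 3. The short sketch below does only that arithmetic; the variable names are chosen for illustration.

```python
# Total elapsed times (ms) copied from Tables 2 and 3 for the list
# sizes that appear in both tables.
opma_total = {10: 3.99, 20: 6.16, 30: 12.14, 40: 24.43, 50: 46.11, 100: 391.28}
opmb_total = {10: 3.56, 20: 4.09, 30: 5.04, 40: 6.39, 50: 8.19, 100: 29.09}

# Speed-up of OPMb over OPMa at each common list size.
speedup = {n: opma_total[n] / opmb_total[n] for n in opma_total}
for n, s in speedup.items():
    print(f"n={n:3d}  OPMa/OPMb = {s:.1f}")
```

The ratio is close to 1 at n = 10, about 2.4 at n = 30, and about 13 at n = 100.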



Table 3: OPMb performance

  n     Total Elapsed       Construct          Match       Match
            (ms)              (ms)             (ms)         (%)
  10     3.56 ± 0.35      0.14 ± 0.03      3.32 ± 0.32     87.24
  20     4.09 ± 0.51      0.57 ± 0.03      3.42 ± 0.51     99.54
  30     5.04 ± 1.06      1.35 ± 0.05      3.58 ± 1.06     99.97
  40     6.39 ± 1.91      2.53 ± 0.16      3.75 ± 1.90    100.00
  50     8.19 ± 2.88      4.17 ± 0.13      3.91 ± 2.87    100.00
  75    15.41 ± 4.45     11.19 ± 0.28      4.11 ± 4.43    100.00
 100    29.09 ± 5.12     24.75 ± 0.59      4.22 ± 5.06    100.00
 150   111.17 ± 7.69    106.46 ± 2.71      4.56 ± 7.18    100.00
 200   360.06 ± 12.69   354.98 ± 8.86      4.91 ± 9.10    100.00



Figure 6: OPMb pair construction and matching
phases. [Legend: Matching Phase, Pair Construction;
x-axis: List size (n)]



Matching time increased by just 1.6 ms as list size 
increased from 10 to 200 points. This is attributable 
to the fact that very few candidates were examined 
to find a successful match, even for large lists. For 
n = 100, 60% of searches were solved using the first 
candidate list and 90% of searches were completed by 
testing < 10 candidate lists. 
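
The behaviour described above follows from testing candidates in order and stopping at the first one that survives verification. A minimal sketch of such a loop is given below. The function name early_exit_search and its callback parameters are hypothetical and introduced only for illustration; the candidates are assumed to arrive already sorted by decreasing probability of success.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

C = TypeVar("C")  # candidate type (hypothetical)
S = TypeVar("S")  # solution type (hypothetical)


def early_exit_search(
    candidates: Iterable[C],
    quick_check: Callable[[C], bool],
    full_verify: Callable[[C], Optional[S]],
) -> Tuple[Optional[S], int]:
    """Test candidates in the given order; stop at the first verified one.

    Returns (solution or None, number of candidates examined).
    """
    examined = 0
    for cand in candidates:
        examined += 1
        if not quick_check(cand):        # cheap preliminary verification
            continue
        solution = full_verify(cand)     # expensive final verification
        if solution is not None:
            return solution, examined    # early exit
    return None, examined                # exhaustive, unsuccessful search


# Toy usage: the third candidate is the first to pass both checks.
solution, examined = early_exit_search(
    candidates=[7, 8, 12, 20, 24],
    quick_check=lambda c: c % 2 == 0,
    full_verify=lambda c: c if c % 3 == 0 else None,
)
print(solution, examined)
```

Because the loop returns as soon as a candidate is verified, its cost depends on how early the correct candidate appears in the ordering rather than on the length of the list.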

Figure [7] plots the time spent in the sub-phases of
matching as a function of n. The search time increased
by < 1 ms for 10 ≤ n ≤ 200, and PV costs were
insignificant, due to the use of PA in shape
characterization, which removes candidates with incorrect
chirality. Final verification accounted for the majority of
the time. I expect that further optimizations in the
final verification phase might reasonably yield matching
times of approximately 1-2 ms.

Figure 7: OPMb matching phase only. [Legend: Final
Verification, Preliminary Verification, Search; x-axis:
List size (n)]

Figure [8] plots the total elapsed time of the search
as a function of n, for both OPMa and OPMb. Also
shown is the time spent in each matching phase,
highlighting the nearly constant matching time of OPMb.




Figure 8: Total time vs matching phase. [x-axis:
List size (n)]



3.3 Relative Performance

A number of authors have provided indicative
performance measurements for their respective
implementations. Unfortunately, absolute timings are
difficult to compare because they are quoted for different
values of n, statistics are not provided for all phases,
and differences in machine architecture and processor
speed play a significant role in determining the overall
performance. Nevertheless, it is possible to make some
general observations by comparing recent results
produced on a similar CPU.

Most recently, Pál & Bakos (2006) demonstrated a
mean elapsed time of ~100 ms to process a full
triangulation of 35 sources using their grmatch task,
which implements a voting algorithm (2.0 GHz 64-bit
AMD Opteron CPU). The time quoted for grmatch excluded
iterative calculation and refinement of the
transformation coefficients. Table [3] shows that OPMb
completes the same task in ~6 ms, including the extra
work of final verification. Even allowing for an ~20%
difference in processor speed, it is clear that early
exits are extremely beneficial, with the performance
differential expected to widen as n increases.

3.4 Ill-conditioned Searches

The preceding tests were performed on wide-field
images where pointing errors were small relative to the
size of the FOV. Thus, there was nearly a 100% overlap
between I and R. We now consider the performance
of OPMb under non-optimal conditions.

Figure [9] plots the match rate and elapsed time of
10 063 searches (n = 100) when the brightest stars
have been omitted from I, as might be the case if
they were saturated or a significant passband disparity
exists. A 100% match rate was maintained even when
skipping 20% of I, dropping slightly to 99.7% at 30%
of I. The elapsed time was only marginally affected
for values up to 30%, but did increase markedly when
a significant fraction of stars were skipped because the
mismatched lists reduced the likelihood of an early exit
being taken. Nevertheless, skipping (an unrealistic)
30% of stars did not significantly affect reliability or
performance.

Figure 9: OPMb reliability and performance
when skipping the brightest I stars. [Panels:
Reliability, Performance; x-axis: % skipped]

Figure [10] plots OPMb performance for partially
overlapping fields. Scenarios where I and R are not
aligned are more typical of narrow-field images, where
pointing errors may be a significant fraction of the
FOV. Curves are plotted for two values of n. A value
of n = 100 was slightly more reliable than n = 40, but
the latter performed far better as the degree of
overlap decreased. Under these conditions, smaller values
of n are favored to avoid long search times when the
chance of finding a successful match is small. If the
coordinates of the field center are unknown, an
iterative (perhaps spiral) search should use a small n to
reduce the elapsed time of any unsuccessful
(exhaustive) searches.
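
One way such a spiral search might be organized is sketched below. The functions spiral_offsets and spiral_field_search, the try_match callback, and the grid step are all hypothetical names and parameters introduced for illustration. The sketch also simplifies the geometry: offsets are applied directly to the coordinates in degrees, ignoring the cos(declination) factor and any wrap-around in right ascension.

```python
from itertools import islice
from typing import Callable, Iterator, Optional, Tuple


def spiral_offsets() -> Iterator[Tuple[int, int]]:
    """Yield integer grid offsets (dx, dy) in an outward square spiral."""
    x = y = 0
    dx, dy = 1, 0
    run = 1                      # current straight-run length
    yield x, y
    while True:
        for _ in range(2):       # two consecutive runs share each length
            for _ in range(run):
                x, y = x + dx, y + dy
                yield x, y
            dx, dy = -dy, dx     # turn 90 degrees
        run += 1


def spiral_field_search(
    ra0: float,
    dec0: float,
    step_deg: float,
    try_match: Callable[[Tuple[float, float]], Optional[object]],
    max_fields: int,
):
    """Try field centres nearest the nominal position first."""
    for dx, dy in islice(spiral_offsets(), max_fields):
        centre = (ra0 + dx * step_deg, dec0 + dy * step_deg)
        result = try_match(centre)   # assumed to use a small n
        if result is not None:
            return centre, result
    return None


first_nine = list(islice(spiral_offsets(), 9))
found = spiral_field_search(
    10.0, 20.0, 1.0,
    try_match=lambda c: "ok" if c == (11.0, 20.0) else None,
    max_fields=25,
)
```

Visiting the nearest fields first keeps the number of unsuccessful, exhaustive searches small when the pointing error is modest.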

Figure 10: OPMb reliability and performance for
partially overlapped fields.

Extremely narrow fields of view were simulated by
conducting tests using very few I stars. Figure [11]
plots reliability and performance when constraining
3 ≤ I ≤ 10. Only the well-focused subset of 6f20
images was used, so that spurious stellar detections
from blended defocused objects would not be included
within I, which would otherwise skew results. The
size of the reference list was set to R = 50 to
reduce the chance of passband disparities producing
non-overlapping lists, which is more likely when both I
and R are small. The plot shows that as few as 5 stars
were sufficient to successfully match 93.4% of cases,
rising to 100% at I = 10. This is in contrast to the
results shown in Table [3] for n < 30, which were less
reliable because both lists were small, resulting in a
reduced match rate. The combination of I = 10, R = 50
provides both high reliability and good performance,
with searches completing in ~8 ms.



Figure 11: OPMb reliability and performance for
fields containing very few stars (3 ≤ I ≤ 10 and
R = 50). [Panels: Reliability, Performance; x-axis:
I stars]



sented. The matching phase of OPMb is nearly O(1),
being independent of list size. These algorithms have a
significant performance advantage over previous
techniques, at a slight loss in generality, caused by the
requirement that the approximate focal length of the
optical system is known a priori. This requirement
permits the determination of the image scale from the
physical dimensions of the detector, allowing OPM
algorithms to directly compare a subset of triangles (or
shapes) to their counterparts derived from a reference
catalogue, without having to process the entire set, as
is the case when the scale is unknown. By employing
early exit strategies, postponing work until absolutely
necessary, testing candidates in the order most likely
to yield success, and combining these with an
efficient mechanism for rejecting false positives, a highly
efficient search in nearly constant time is possible.

Small uncertainties in the focal length, such as
those caused by temperature-related changes, are
accommodated by selecting an appropriate matching
tolerance. The actual focal length is determined and
reported as part of the astrometric solution.

The OPM algorithms are particularly suited to
processing large lists or to situations where pattern
matching must be performed as quickly as possible.
The performance of these algorithms makes it
practical to search thousands of fields very quickly if,
for example, the coordinates of the field center were
unknown. Similarly, when only an approximate focal
length is known, it is perfectly reasonable to attempt
to iteratively match the field using a range of focal
lengths.
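
The relation between focal length and image scale, and an iterative sweep over trial focal lengths, can be sketched as follows. The scale formula is the standard small-angle relation. Everything else is an assumption made for illustration: the function names, the try_match callback, the 9 micron pixel size, and the choice to try focal lengths closest to the nominal value first.

```python
import math
from typing import Callable, Optional

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi   # approximately 206265


def image_scale_arcsec_per_pixel(focal_length_mm: float,
                                 pixel_size_um: float) -> float:
    """Small-angle image scale for a pixel of the given physical size."""
    return ARCSEC_PER_RADIAN * (pixel_size_um * 1e-3) / focal_length_mm


def sweep_focal_lengths(
    nominal_mm: float,
    span_fraction: float,
    steps: int,
    try_match: Callable[[float], Optional[object]],
):
    """Attempt a match over nominal * (1 +/- span_fraction)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    trials = [
        nominal_mm * (1.0 + span_fraction * (2.0 * i / (steps - 1) - 1.0))
        for i in range(steps)
    ]
    # Hypothetical ordering: most plausible focal lengths first.
    trials.sort(key=lambda f: abs(f - nominal_mm))
    for f in trials:
        solution = try_match(f)
        if solution is not None:
            return f, solution       # early exit on the first success
    return None


scale = image_scale_arcsec_per_pixel(focal_length_mm=180.0, pixel_size_um=9.0)

calls = []
def fake_match(f):                   # stand-in for a real matching attempt
    calls.append(f)
    return "solved" if abs(f - 184.5) < 0.01 else None

result = sweep_focal_lengths(180.0, 0.05, 5, fake_match)
```

With these illustrative values the scale is about 10.3 arcsec per pixel, and the sweep stops as soon as one trial focal length yields a verified match.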